Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 682 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 86.7 KiB |
| Average record size in memory | 130.2 B |
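The average record size in the table above follows from the total memory footprint divided by the number of observations. A minimal arithmetic check, assuming 1 KiB = 1024 B and using only the figures reported in the table (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" table above
total_kib = 86.7          # total size in memory, in KiB
n_observations = 682      # number of rows

# Convert KiB to bytes, then divide by the row count
average_bytes = total_kib * 1024 / n_observations

print(f"Average record size: {average_bytes:.1f} B")
```

Because the reported total is itself rounded to one decimal place, the result agrees with the table only to that precision.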
Variable types
| Numeric | 13 |
|---|---|
| Categorical | 3 |
| Boolean | 2 |
amount is highly correlated with duration and 1 other fields | High correlation |
duration is highly correlated with amount | High correlation |
payments is highly correlated with amount and 1 other fields | High correlation |
Average_order_amount is highly correlated with payments and 1 other fields | High correlation |
Average_trans_amount is highly correlated with Average_order_amount and 1 other fields | High correlation |
Average_trans_balance is highly correlated with Average_trans_amount | High correlation |
No_inhabitants is highly correlated with Average_salary and 1 other fields | High correlation |
Average_salary is highly correlated with No_inhabitants and 1 other fields | High correlation |
Average_crime_rate is highly correlated with No_inhabitants and 1 other fields | High correlation |
amount is highly correlated with payments | High correlation |
payments is highly correlated with amount and 1 other fields | High correlation |
Average_order_amount is highly correlated with payments | High correlation |
Average_trans_amount is highly correlated with Average_trans_balance | High correlation |
Average_trans_balance is highly correlated with Average_trans_amount | High correlation |
No_inhabitants is highly correlated with Average_salary | High correlation |
Average_salary is highly correlated with No_inhabitants and 1 other fields | High correlation |
Average_crime_rate is highly correlated with Average_salary | High correlation |
amount is highly correlated with duration and 3 other fields | High correlation |
duration is highly correlated with amount | High correlation |
payments is highly correlated with amount and 2 other fields | High correlation |
Average_order_amount is highly correlated with amount and 2 other fields | High correlation |
Average_trans_amount is highly correlated with amount and 3 other fields | High correlation |
Average_trans_balance is highly correlated with Average_trans_amount | High correlation |
No_transaction is highly correlated with Same_district and 1 other fields | High correlation |
No_inhabitants is highly correlated with Average_salary and 2 other fields | High correlation |
Average_salary is highly correlated with No_inhabitants and 2 other fields | High correlation |
Average_unemployment_rate is highly correlated with No_inhabitants and 2 other fields | High correlation |
Average_crime_rate is highly correlated with No_inhabitants and 2 other fields | High correlation |
Same_district is highly correlated with No_transaction | High correlation |
Default is highly correlated with No_transaction | High correlation |
account_id has unique values | Unique |
Average_trans_amount has unique values | Unique |
Average_trans_balance has unique values | Unique |
Reproduction
| Analysis started | 2022-04-13 09:23:06.035323 |
|---|---|
| Analysis finished | 2022-04-13 09:23:45.461534 |
| Duration | 39.43 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
account_id
Real number (ℝ≥0)
| Distinct | 682 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5824.162757 |
| Minimum | 2 |
|---|---|
| Maximum | 11362 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 677.15 |
| Q1 | 2967 |
| median | 5738.5 |
| Q3 | 8686 |
| 95-th percentile | 10811.75 |
| Maximum | 11362 |
| Range | 11360 |
| Interquartile range (IQR) | 5719 |
Descriptive statistics
| Standard deviation | 3283.512681 |
|---|---|
| Coefficient of variation (CV) | 0.5637741969 |
| Kurtosis | -1.214471838 |
| Mean | 5824.162757 |
| Median Absolute Deviation (MAD) | 2847.5 |
| Skewness | -0.04138031497 |
| Sum | 3972079 |
| Variance | 10781455.53 |
| Monotonicity | Not monotonic |
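Several entries in the quantile and descriptive tables above are derived from others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR and range are simple differences. A short consistency check using only the values reported above (variable names are illustrative, and the inputs are rounded, so agreement is approximate):

```python
# Values copied from the tables above
mean = 5824.162757
std = 3283.512681
q1, q3 = 2967, 8686
minimum, maximum = 2, 11362

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(cv, variance, iqr, value_range)
```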
| Value | Count | Frequency (%) |
| 5270 | 1 | 0.1% |
| 7565 | 1 | 0.1% |
| 10478 | 1 | 0.1% |
| 2187 | 1 | 0.1% |
| 718 | 1 | 0.1% |
| 103 | 1 | 0.1% |
| 666 | 1 | 0.1% |
| 7890 | 1 | 0.1% |
| 2936 | 1 | 0.1% |
| 10065 | 1 | 0.1% |
| Other values (672) | 672 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 19 | 1 | |
| 25 | 1 | |
| 37 | 1 | |
| 38 | 1 | |
| 67 | 1 | |
| 97 | 1 | |
| 103 | 1 | |
| 105 | 1 | |
| 110 | 1 |
| Value | Count | Frequency (%) |
| 11362 | 1 | |
| 11359 | 1 | |
| 11349 | 1 | |
| 11328 | 1 | |
| 11327 | 1 | |
| 11317 | 1 | |
| 11271 | 1 | |
| 11265 | 1 | |
| 11244 | 1 | |
| 11231 | 1 |
amount
Real number (ℝ≥0)
| Distinct | 645 |
|---|---|
| Distinct (%) | 94.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 151410.176 |
| Minimum | 4980 |
|---|---|
| Maximum | 590820 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 4980 |
|---|---|
| 5-th percentile | 22972.2 |
| Q1 | 66732 |
| median | 116928 |
| Q3 | 210654 |
| 95-th percentile | 388365.6 |
| Maximum | 590820 |
| Range | 585840 |
| Interquartile range (IQR) | 143922 |
Descriptive statistics
| Standard deviation | 113372.4063 |
|---|---|
| Coefficient of variation (CV) | 0.7487766631 |
| Kurtosis | 0.8378338014 |
| Mean | 151410.176 |
| Median Absolute Deviation (MAD) | 67650 |
| Skewness | 1.114209923 |
| Sum | 103261740 |
| Variance | 1.285330251 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 30276 | 4 | 0.6% |
| 86184 | 3 | 0.4% |
| 265320 | 2 | 0.3% |
| 232560 | 2 | 0.3% |
| 67464 | 2 | 0.3% |
| 174744 | 2 | 0.3% |
| 272220 | 2 | 0.3% |
| 155616 | 2 | 0.3% |
| 125472 | 2 | 0.3% |
| 49320 | 2 | 0.3% |
| Other values (635) | 659 |
| Value | Count | Frequency (%) |
| 4980 | 1 | |
| 5148 | 1 | |
| 7656 | 1 | |
| 8616 | 1 | |
| 10944 | 1 | |
| 11400 | 1 | |
| 11736 | 1 | |
| 12540 | 1 | |
| 12792 | 1 | |
| 14028 | 1 |
| Value | Count | Frequency (%) |
| 590820 | 1 | |
| 566640 | 1 | |
| 541200 | 1 | |
| 538500 | 1 | |
| 504000 | 1 | |
| 495180 | 1 | |
| 482940 | 1 | |
| 475680 | 1 | |
| 473280 | 1 | |
| 468060 | 1 |
duration
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.5 KiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 24 |
|---|---|
| 2nd row | 12 |
| 3rd row | 36 |
| 4th row | 12 |
| 5th row | 12 |
Common Values
| Value | Count | Frequency (%) |
| 60 | 145 | 21.3% |
| 24 | 138 | 20.2% |
| 48 | 138 | 20.2% |
| 12 | 131 | 19.2% |
| 36 | 130 | 19.1% |
payments
Real number (ℝ≥0)
| Distinct | 577 |
|---|---|
| Distinct (%) | 84.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4190.664223 |
| Minimum | 304 |
|---|---|
| Maximum | 9910 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 304 |
|---|---|
| 5-th percentile | 963.9 |
| Q1 | 2477 |
| median | 3934 |
| Q3 | 5813.5 |
| 95-th percentile | 8048.2 |
| Maximum | 9910 |
| Range | 9606 |
| Interquartile range (IQR) | 3336.5 |
Descriptive statistics
| Standard deviation | 2215.830344 |
|---|---|
| Coefficient of variation (CV) | 0.5287539699 |
| Kurtosis | -0.6975693672 |
| Mean | 4190.664223 |
| Median Absolute Deviation (MAD) | 1688 |
| Skewness | 0.3543746397 |
| Sum | 2858033 |
| Variance | 4909904.115 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2523 | 4 | 0.6% |
| 3151 | 4 | 0.6% |
| 4363 | 3 | 0.4% |
| 3874 | 3 | 0.4% |
| 2307 | 3 | 0.4% |
| 5354 | 3 | 0.4% |
| 7370 | 3 | 0.4% |
| 2779 | 3 | 0.4% |
| 3698 | 3 | 0.4% |
| 4537 | 3 | 0.4% |
| Other values (567) | 650 |
| Value | Count | Frequency (%) |
| 304 | 1 | |
| 312 | 1 | |
| 319 | 1 | |
| 334 | 1 | |
| 359 | 1 | |
| 371 | 1 | |
| 403 | 1 | |
| 415 | 1 | |
| 424 | 1 | |
| 429 | 1 |
| Value | Count | Frequency (%) |
| 9910 | 1 | |
| 9847 | 1 | |
| 9736 | 1 | |
| 9721 | 1 | |
| 9698 | 1 | |
| 9689 | 1 | |
| 9444 | 1 | |
| 9268 | 1 | |
| 9112 | 1 | |
| 9020 | 1 |
DID
Real number (ℝ≥0)
| Distinct | 400 |
|---|---|
| Distinct (%) | 58.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 398.2404692 |
| Minimum | 102 |
|---|---|
| Maximum | 697 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 102 |
|---|---|
| 5-th percentile | 136.05 |
| Q1 | 261.25 |
| median | 395.5 |
| Q3 | 528.75 |
| 95-th percentile | 664 |
| Maximum | 697 |
| Range | 595 |
| Interquartile range (IQR) | 267.5 |
Descriptive statistics
| Standard deviation | 164.611359 |
|---|---|
| Coefficient of variation (CV) | 0.4133466378 |
| Kurtosis | -1.058438529 |
| Mean | 398.2404692 |
| Median Absolute Deviation (MAD) | 134 |
| Skewness | 0.02740908412 |
| Sum | 271600 |
| Variance | 27096.89951 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 561 | 6 | 0.9% |
| 333 | 5 | 0.7% |
| 313 | 4 | 0.6% |
| 440 | 4 | 0.6% |
| 415 | 4 | 0.6% |
| 254 | 4 | 0.6% |
| 304 | 4 | 0.6% |
| 125 | 4 | 0.6% |
| 684 | 4 | 0.6% |
| 239 | 4 | 0.6% |
| Other values (390) | 639 |
| Value | Count | Frequency (%) |
| 102 | 1 | |
| 103 | 1 | |
| 105 | 2 | |
| 107 | 1 | |
| 108 | 2 | |
| 110 | 1 | |
| 111 | 1 | |
| 114 | 1 | |
| 115 | 2 | |
| 117 | 1 |
| Value | Count | Frequency (%) |
| 697 | 1 | 0.1% |
| 696 | 1 | 0.1% |
| 694 | 1 | 0.1% |
| 693 | 3 | |
| 691 | 2 | |
| 689 | 2 | |
| 686 | 1 | 0.1% |
| 685 | 1 | 0.1% |
| 684 | 4 | |
| 683 | 1 | 0.1% |
Average_order_amount
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 651 |
|---|---|
| Distinct (%) | 95.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4486.950975 |
| Minimum | 312 |
|---|---|
| Maximum | 10817 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 312 |
|---|---|
| 5-th percentile | 1537.4125 |
| Q1 | 2823.516667 |
| median | 4193.616667 |
| Q3 | 5863.05 |
| 95-th percentile | 8252.44 |
| Maximum | 10817 |
| Range | 10505 |
| Interquartile range (IQR) | 3039.533333 |
Descriptive statistics
| Standard deviation | 2134.790905 |
|---|---|
| Coefficient of variation (CV) | 0.475777631 |
| Kurtosis | -0.4391549935 |
| Mean | 4486.950975 |
| Median Absolute Deviation (MAD) | 1500.533333 |
| Skewness | 0.5143048559 |
| Sum | 3060100.565 |
| Variance | 4557332.21 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5353.5 | 3 | 0.4% |
| 3162.5 | 2 | 0.3% |
| 8033.2 | 2 | 0.3% |
| 3475.166667 | 2 | 0.3% |
| 3216.7 | 2 | 0.3% |
| 1860 | 2 | 0.3% |
| 4845 | 2 | 0.3% |
| 3151.3 | 2 | 0.3% |
| 3252.5 | 2 | 0.3% |
| 5577.3 | 2 | 0.3% |
| Other values (641) | 661 |
| Value | Count | Frequency (%) |
| 312 | 1 | |
| 519 | 1 | |
| 529 | 1 | |
| 708 | 1 | |
| 763.5 | 1 | |
| 811.6 | 1 | |
| 846.6 | 1 | |
| 850 | 1 | |
| 1041.5 | 1 | |
| 1054 | 1 |
| Value | Count | Frequency (%) |
| 10817 | 1 | |
| 10370 | 1 | |
| 10054.85 | 1 | |
| 10038.15 | 1 | |
| 9967.65 | 1 | |
| 9794.5 | 1 | |
| 9721 | 1 | |
| 9689 | 1 | |
| 9679 | 1 | |
| 9354.4 | 1 |
Average_trans_amount
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 682 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8460.899856 |
| Minimum | 1315.619048 |
|---|---|
| Maximum | 19772.47368 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 1315.619048 |
|---|---|
| 5-th percentile | 3094.60153 |
| Q1 | 5490.948511 |
| median | 8331.56015 |
| Q3 | 11346.23043 |
| 95-th percentile | 14409.20299 |
| Maximum | 19772.47368 |
| Range | 18456.85464 |
| Interquartile range (IQR) | 5855.281915 |
Descriptive statistics
| Standard deviation | 3664.134172 |
|---|---|
| Coefficient of variation (CV) | 0.4330667227 |
| Kurtosis | -0.7154212983 |
| Mean | 8460.899856 |
| Median Absolute Deviation (MAD) | 2947.734174 |
| Skewness | 0.2338828596 |
| Sum | 5770333.702 |
| Variance | 13425879.23 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 15016.19481 | 1 | 0.1% |
| 6117.401786 | 1 | 0.1% |
| 11961.10363 | 1 | 0.1% |
| 13587.45045 | 1 | 0.1% |
| 9901.736655 | 1 | 0.1% |
| 9268.962162 | 1 | 0.1% |
| 3887.442029 | 1 | 0.1% |
| 2988.176471 | 1 | 0.1% |
| 4320.055276 | 1 | 0.1% |
| 15935.4981 | 1 | 0.1% |
| Other values (672) | 672 |
| Value | Count | Frequency (%) |
| 1315.619048 | 1 | |
| 1522.287402 | 1 | |
| 1540.115044 | 1 | |
| 1556.781737 | 1 | |
| 1894.163121 | 1 | |
| 2002.994485 | 1 | |
| 2022.054422 | 1 | |
| 2054.305296 | 1 | |
| 2073.11 | 1 | |
| 2147.04 | 1 |
| Value | Count | Frequency (%) |
| 19772.47368 | 1 | |
| 17747.78947 | 1 | |
| 17625.825 | 1 | |
| 17554.81034 | 1 | |
| 17553.50943 | 1 | |
| 17180.59639 | 1 | |
| 17135.3617 | 1 | |
| 16946.40523 | 1 | |
| 16527.97538 | 1 | |
| 16430.53659 | 1 |
Average_trans_balance
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 682 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 44935.27845 |
| Minimum | 6690.547112 |
|---|---|
| Maximum | 79272.1245 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 6690.547112 |
|---|---|
| 5-th percentile | 20689.56502 |
| Q1 | 34685.44948 |
| median | 45614.40178 |
| Q3 | 55426.2194 |
| 95-th percentile | 67146.79671 |
| Maximum | 79272.1245 |
| Range | 72581.57739 |
| Interquartile range (IQR) | 20740.76992 |
Descriptive statistics
| Standard deviation | 14082.25188 |
|---|---|
| Coefficient of variation (CV) | 0.3133896655 |
| Kurtosis | -0.6005487629 |
| Mean | 44935.27845 |
| Median Absolute Deviation (MAD) | 10513.49402 |
| Skewness | -0.111166485 |
| Sum | 30645859.91 |
| Variance | 198309818.1 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 65773.45671 | 1 | 0.1% |
| 31061.45089 | 1 | 0.1% |
| 55270.9171 | 1 | 0.1% |
| 65723.25676 | 1 | 0.1% |
| 48722.7331 | 1 | 0.1% |
| 48288.84865 | 1 | 0.1% |
| 47634.97464 | 1 | 0.1% |
| 29862.17647 | 1 | 0.1% |
| 26998.24623 | 1 | 0.1% |
| 73529.36122 | 1 | 0.1% |
| Other values (672) | 672 |
| Value | Count | Frequency (%) |
| 6690.547112 | 1 | |
| 9629.374396 | 1 | |
| 10176.70698 | 1 | |
| 13352.88889 | 1 | |
| 13411.63348 | 1 | |
| 14818.53906 | 1 | |
| 15259.625 | 1 | |
| 15865.94382 | 1 | |
| 16124.0557 | 1 | |
| 16253.06931 | 1 |
| Value | Count | Frequency (%) |
| 79272.1245 | 1 | |
| 77371.25275 | 1 | |
| 76453.66923 | 1 | |
| 76110.125 | 1 | |
| 75639.62766 | 1 | |
| 75595.65323 | 1 | |
| 73529.36122 | 1 | |
| 73384.10023 | 1 | |
| 73271.69919 | 1 | |
| 73259.46462 | 1 |
No_transaction
Real number (ℝ≥0)
| Distinct | 356 |
|---|---|
| Distinct (%) | 52.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 280.8739003 |
| Minimum | 49 |
|---|---|
| Maximum | 675 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 49 |
|---|---|
| 5-th percentile | 99.05 |
| Q1 | 173.25 |
| median | 250.5 |
| Q3 | 392 |
| 95-th percentile | 533 |
| Maximum | 675 |
| Range | 626 |
| Interquartile range (IQR) | 218.75 |
Descriptive statistics
| Standard deviation | 136.8920234 |
|---|---|
| Coefficient of variation (CV) | 0.4873789386 |
| Kurtosis | -0.4993678567 |
| Mean | 280.8739003 |
| Median Absolute Deviation (MAD) | 97 |
| Skewness | 0.5778148295 |
| Sum | 191556 |
| Variance | 18739.42607 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 303 | 6 | 0.9% |
| 149 | 6 | 0.9% |
| 196 | 6 | 0.9% |
| 274 | 6 | 0.9% |
| 177 | 5 | 0.7% |
| 128 | 5 | 0.7% |
| 161 | 5 | 0.7% |
| 192 | 5 | 0.7% |
| 339 | 5 | 0.7% |
| 198 | 5 | 0.7% |
| Other values (346) | 628 |
| Value | Count | Frequency (%) |
| 49 | 1 | |
| 57 | 1 | |
| 59 | 1 | |
| 63 | 1 | |
| 65 | 1 | |
| 72 | 1 | |
| 73 | 2 | |
| 74 | 1 | |
| 75 | 1 | |
| 76 | 2 |
| Value | Count | Frequency (%) |
| 675 | 1 | |
| 665 | 1 | |
| 649 | 1 | |
| 643 | 1 | |
| 637 | 1 | |
| 634 | 1 | |
| 633 | 1 | |
| 628 | 1 | |
| 611 | 1 | |
| 609 | 1 |
Card_type
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.5 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 2 |
| Mean length | 3.14516129 |
| Min length | 2 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No |
|---|---|
| 2nd row | No |
| 3rd row | No |
| 4th row | No |
| 5th row | No |
Common Values
| Value | Count | Frequency (%) |
| No | 512 | 75.1% |
| classic | 133 | 19.5% |
| junior | 21 | 3.1% |
| gold | 16 | 2.3% |
No_inhabitants
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 77 |
|---|---|
| Distinct (%) | 11.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 272052.2361 |
| Minimum | 42821 |
|---|---|
| Maximum | 1204953 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 42821 |
|---|---|
| 5-th percentile | 51428 |
| Q1 | 92084 |
| median | 124605 |
| Q3 | 226122 |
| 95-th percentile | 1204953 |
| Maximum | 1204953 |
| Range | 1162132 |
| Interquartile range (IQR) | 134038 |
Descriptive statistics
| Standard deviation | 358331.9752 |
|---|---|
| Coefficient of variation (CV) | 1.317144017 |
| Kurtosis | 2.745126016 |
| Mean | 272052.2361 |
| Median Absolute Deviation (MAD) | 45747 |
| Skewness | 2.104797798 |
| Sum | 185539625 |
| Variance | 1.284018044 × 10^11 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1204953 | 84 | 12.3% |
| 387570 | 24 | 3.5% |
| 285387 | 24 | 3.5% |
| 323870 | 20 | 2.9% |
| 197099 | 17 | 2.5% |
| 228848 | 16 | 2.3% |
| 226122 | 14 | 2.1% |
| 139012 | 14 | 2.1% |
| 51428 | 14 | 2.1% |
| 85852 | 13 | 1.9% |
| Other values (67) | 442 |
| Value | Count | Frequency (%) |
| 42821 | 8 | |
| 45714 | 8 | |
| 51313 | 8 | |
| 51428 | 14 | |
| 53921 | 8 | |
| 58400 | 2 | 0.3% |
| 58796 | 7 | |
| 67298 | 9 | |
| 70646 | 6 | |
| 70699 | 3 | 0.4% |
| Value | Count | Frequency (%) |
| 1204953 | 84 | |
| 387570 | 24 | 3.5% |
| 323870 | 20 | 2.9% |
| 285387 | 24 | 3.5% |
| 228848 | 16 | 2.3% |
| 226122 | 14 | 2.1% |
| 197099 | 17 | 2.5% |
| 182027 | 8 | 1.2% |
| 177686 | 8 | 1.2% |
| 170449 | 6 | 0.9% |
Average_salary
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 76 |
|---|---|
| Distinct (%) | 11.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9502.986804 |
| Minimum | 8110 |
|---|---|
| Maximum | 12541 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 8110 |
|---|---|
| 5-th percentile | 8240 |
| Q1 | 8544 |
| median | 8991 |
| Q3 | 9897 |
| 95-th percentile | 12541 |
| Maximum | 12541 |
| Range | 4431 |
| Interquartile range (IQR) | 1353 |
Descriptive statistics
| Standard deviation | 1323.150982 |
|---|---|
| Coefficient of variation (CV) | 0.1392352751 |
| Kurtosis | 0.6849555243 |
| Mean | 9502.986804 |
| Median Absolute Deviation (MAD) | 601 |
| Skewness | 1.332946199 |
| Sum | 6481037 |
| Variance | 1750728.521 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12541 | 84 | 12.3% |
| 9897 | 24 | 3.5% |
| 10177 | 24 | 3.5% |
| 10673 | 20 | 2.9% |
| 9624 | 17 | 2.5% |
| 9893 | 16 | 2.3% |
| 8994 | 14 | 2.1% |
| 8363 | 14 | 2.1% |
| 8402 | 14 | 2.1% |
| 8965 | 13 | 1.9% |
| Other values (66) | 442 |
| Value | Count | Frequency (%) |
| 8110 | 6 | |
| 8114 | 4 | 0.6% |
| 8173 | 8 | |
| 8187 | 12 | |
| 8208 | 3 | 0.4% |
| 8240 | 7 | |
| 8254 | 8 | |
| 8288 | 7 | |
| 8363 | 14 | |
| 8369 | 11 |
| Value | Count | Frequency (%) |
| 12541 | 84 | |
| 11277 | 5 | 0.7% |
| 10787 | 6 | 0.9% |
| 10673 | 20 | 2.9% |
| 10446 | 5 | 0.7% |
| 10177 | 24 | 3.5% |
| 10124 | 5 | 0.7% |
| 10045 | 8 | 1.2% |
| 9920 | 6 | 0.9% |
| 9897 | 24 | 3.5% |
Average_unemployment_rate
Real number (ℝ≥0)
| Distinct | 76 |
|---|---|
| Distinct (%) | 11.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.192917889 |
| Minimum | 0.36 |
|---|---|
| Maximum | 8.37 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 0.36 |
|---|---|
| 5-th percentile | 0.36 |
| Q1 | 1.78 |
| median | 3.09 |
| Q3 | 4.295 |
| 95-th percentile | 7.19 |
| Maximum | 8.37 |
| Range | 8.01 |
| Interquartile range (IQR) | 2.515 |
Descriptive statistics
| Standard deviation | 1.98404887 |
|---|---|
| Coefficient of variation (CV) | 0.6213905083 |
| Kurtosis | -0.3700318927 |
| Mean | 3.192917889 |
| Median Absolute Deviation (MAD) | 1.31 |
| Skewness | 0.4705602496 |
| Sum | 2177.57 |
| Variance | 3.936449917 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.36 | 84 | 12.3% |
| 1.78 | 24 | 3.5% |
| 7.19 | 24 | 3.5% |
| 5.095 | 20 | 2.9% |
| 1.955 | 17 | 2.5% |
| 4.405 | 16 | 2.3% |
| 4.295 | 14 | 2.1% |
| 3.005 | 14 | 2.1% |
| 3.555 | 14 | 2.1% |
| 7.655 | 13 | 1.9% |
| Other values (66) | 442 |
| Value | Count | Frequency (%) |
| 0.36 | 84 | |
| 0.52 | 11 | 1.6% |
| 0.55 | 5 | 0.7% |
| 0.97 | 9 | 1.3% |
| 1.115 | 1 | 0.1% |
| 1.175 | 2 | 0.3% |
| 1.33 | 11 | 1.6% |
| 1.345 | 5 | 0.7% |
| 1.565 | 8 | 1.2% |
| 1.575 | 10 | 1.5% |
| Value | Count | Frequency (%) |
| 8.37 | 5 | 0.7% |
| 7.655 | 13 | |
| 7.19 | 24 | |
| 7.055 | 9 | 1.3% |
| 6.78 | 6 | 0.9% |
| 6.68 | 4 | 0.6% |
| 6.16 | 6 | 0.9% |
| 5.75 | 6 | 0.9% |
| 5.73 | 6 | 0.9% |
| 5.52 | 4 | 0.6% |
Average_crime_rate
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 77 |
|---|---|
| Distinct (%) | 11.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.03603794572 |
| Minimum | 0.01474440334 |
|---|---|
| Maximum | 0.07667684964 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 0.01474440334 |
|---|---|
| 5-th percentile | 0.01734219529 |
| Q1 | 0.0219757665 |
| median | 0.03166882101 |
| Q3 | 0.04270287024 |
| 95-th percentile | 0.07667684964 |
| Maximum | 0.07667684964 |
| Range | 0.0619324463 |
| Interquartile range (IQR) | 0.02072710374 |
Descriptive statistics
| Standard deviation | 0.01831162415 |
|---|---|
| Coefficient of variation (CV) | 0.5081206428 |
| Kurtosis | 0.3181398925 |
| Mean | 0.03603794572 |
| Median Absolute Deviation (MAD) | 0.01030393443 |
| Skewness | 1.165048596 |
| Sum | 24.57787898 |
| Variance | 0.0003353155789 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.07667684964 | 84 | 12.3% |
| 0.04827128003 | 24 | 3.5% |
| 0.03501561038 | 24 | 3.5% |
| 0.05732083861 | 20 | 2.9% |
| 0.02219443021 | 17 | 2.5% |
| 0.02514769629 | 16 | 2.3% |
| 0.04174737531 | 14 | 2.1% |
| 0.01928970161 | 14 | 2.1% |
| 0.02039744886 | 14 | 2.1% |
| 0.03188626939 | 13 | 1.9% |
| Other values (67) | 442 |
| Value | Count | Frequency (%) |
| 0.01474440334 | 4 | |
| 0.01585670582 | 8 | |
| 0.01601830664 | 9 | |
| 0.01728495136 | 7 | |
| 0.01733971452 | 7 | |
| 0.0173893301 | 7 | |
| 0.01746558321 | 7 | |
| 0.01768916936 | 6 | |
| 0.01865949162 | 8 | |
| 0.01866814891 | 9 |
| Value | Count | Frequency (%) |
| 0.07667684964 | 84 | |
| 0.05732083861 | 20 | 2.9% |
| 0.05393855664 | 6 | 0.9% |
| 0.05021491783 | 3 | 0.4% |
| 0.04827128003 | 24 | 3.5% |
| 0.04821852732 | 6 | 0.9% |
| 0.04630009123 | 11 | 1.6% |
| 0.04535760496 | 5 | 0.7% |
| 0.04412275748 | 5 | 0.7% |
| 0.04270287024 | 8 | 1.2% |
gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.5 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.020527859 |
| Min length | 4 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | male |
|---|---|
| 2nd row | male |
| 3rd row | male |
| 4th row | female |
| 5th row | male |
Common Values
| Value | Count | Frequency (%) |
| female | 348 | 51.0% |
| male | 334 | 49.0% |
Owner_age
Real number (ℝ≥0)
| Distinct | 391 |
|---|---|
| Distinct (%) | 57.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 457.2800587 |
| Minimum | 162 |
|---|---|
| Maximum | 742 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.5 KiB |
Quantile statistics
| Minimum | 162 |
|---|---|
| 5-th percentile | 220 |
| Q1 | 326.5 |
| median | 451.5 |
| Q3 | 587 |
| 95-th percentile | 693 |
| Maximum | 742 |
| Range | 580 |
| Interquartile range (IQR) | 260.5 |
Descriptive statistics
| Standard deviation | 153.3584071 |
|---|---|
| Coefficient of variation (CV) | 0.3353708613 |
| Kurtosis | -1.129721745 |
| Mean | 457.2800587 |
| Median Absolute Deviation (MAD) | 129.5 |
| Skewness | 0.0191246409 |
| Sum | 311865 |
| Variance | 23518.80104 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 644 | 6 | 0.9% |
| 382 | 6 | 0.9% |
| 362 | 6 | 0.9% |
| 535 | 5 | 0.7% |
| 603 | 5 | 0.7% |
| 239 | 5 | 0.7% |
| 561 | 5 | 0.7% |
| 276 | 5 | 0.7% |
| 284 | 5 | 0.7% |
| 560 | 4 | 0.6% |
| Other values (381) | 630 |
| Value | Count | Frequency (%) |
| 162 | 1 | |
| 170 | 1 | |
| 171 | 1 | |
| 175 | 1 | |
| 176 | 1 | |
| 179 | 1 | |
| 183 | 1 | |
| 186 | 1 | |
| 187 | 2 | |
| 190 | 1 |
| Value | Count | Frequency (%) |
| 742 | 1 | 0.1% |
| 741 | 2 | |
| 735 | 1 | 0.1% |
| 734 | 2 | |
| 731 | 1 | 0.1% |
| 729 | 1 | 0.1% |
| 728 | 1 | 0.1% |
| 727 | 1 | 0.1% |
| 724 | 1 | 0.1% |
| 722 | 3 |
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 810.0 B |
| Value | Count | Frequency (%) |
| True | 622 | 91.2% |
| False | 60 | 8.8% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
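The rank-based definition above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs: the helper names and the toy data are assumptions, and tied values receive the average of their rank positions.

```python
from math import sqrt

def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy data (hypothetical): y = x**3 is monotonic but not linear
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering of the values matters, the cubic relationship in the toy data yields a ρ of 1 up to floating-point rounding.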
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
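A minimal sketch of this calculation in plain Python, using hypothetical toy data rather than any column of the profiled dataset. It also illustrates the invariance property: shifting and positively rescaling one variable leaves r unchanged (a negative scale factor would flip its sign).

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of covariance and standard deviations cancel out
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)  # undefined if either variable is constant

# Toy data (hypothetical)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Change of location and (positive) scale applied to x only
r_shifted = pearson_r([10.0 * a + 3.0 for a in x], y)
print(r, r_shifted)
```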
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
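The pair-counting definition above corresponds to the variant usually called τ-a. The sketch below implements exactly that definition on hypothetical toy data; it makes no adjustment for ties, so library implementations that use a tie-corrected variant may return different values on tied data.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1      # both variables order the pair the same way
        elif product < 0:
            discordant += 1      # the variables order the pair in opposite ways
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data (hypothetical): 7 concordant and 3 discordant pairs out of 10
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
tau = kendall_tau_a(x, y)
print(tau)
```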
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
First rows
| | account_id | amount | duration | payments | DID | Average_order_amount | Average_trans_amount | Average_trans_balance | No_transaction | Card_type | No_inhabitants | Average_salary | Average_unemployment_rate | Average_crime_rate | gender | Owner_age | Same_district | Default |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5270 | 79608 | 24 | 3317.0 | 313 | 3317.000000 | 15016.194805 | 65773.456710 | 462 | No | 105606 | 8254 | 3.275 | 0.021263 | male | 282.0 | True | False |
| 1 | 11265 | 52788 | 12 | 4399.0 | 244 | 2504.000000 | 2654.845041 | 22010.194215 | 484 | No | 58796 | 9045 | 3.365 | 0.031669 | male | 246.0 | True | False |
| 2 | 10364 | 21924 | 36 | 609.0 | 297 | 3435.500000 | 4808.558522 | 46258.338809 | 487 | No | 157042 | 8743 | 2.155 | 0.024048 | male | 398.0 | True | False |
| 3 | 3834 | 23052 | 12 | 1921.0 | 617 | 1772.200000 | 3333.601504 | 24129.730827 | 665 | No | 387570 | 9897 | 1.780 | 0.048271 | female | 646.0 | True | False |
| 4 | 9307 | 41904 | 12 | 3492.0 | 603 | 3626.833333 | 7649.588598 | 58766.853621 | 649 | No | 228848 | 9893 | 4.405 | 0.025148 | male | 242.0 | False | False |
| 5 | 5891 | 65184 | 12 | 5432.0 | 448 | 5432.300000 | 11066.318966 | 70233.519397 | 464 | gold | 387570 | 9897 | 1.780 | 0.048271 | male | 438.0 | False | False |
| 6 | 6473 | 76908 | 12 | 6409.0 | 485 | 6408.800000 | 11575.716702 | 46888.488372 | 473 | No | 107870 | 8754 | 4.070 | 0.035561 | female | 580.0 | False | True |
| 7 | 1843 | 105804 | 36 | 2939.0 | 185 | 4966.350000 | 5983.716450 | 35669.385281 | 462 | classic | 107870 | 8754 | 4.070 | 0.035561 | female | 639.0 | False | False |
| 8 | 9265 | 39576 | 12 | 3298.0 | 522 | 5602.766667 | 8694.995334 | 64095.143079 | 643 | junior | 1204953 | 12541 | 0.360 | 0.076677 | female | 170.0 | False | False |
| 9 | 8051 | 208320 | 48 | 4340.0 | 479 | 4340.000000 | 9533.831731 | 41532.387019 | 416 | No | 1204953 | 12541 | 0.360 | 0.076677 | female | 313.0 | False | False |
Last rows
| | account_id | amount | duration | payments | DID | Average_order_amount | Average_trans_amount | Average_trans_balance | No_transaction | Card_type | No_inhabitants | Average_salary | Average_unemployment_rate | Average_crime_rate | gender | Owner_age | Same_district | Default |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 672 | 105 | 352704 | 48 | 7348.0 | 513 | 7348.000 | 8368.847458 | 30200.627119 | 59 | classic | 103347 | 9104 | 1.790 | 0.022512 | female | 575.0 | True | True |
| 673 | 11317 | 317460 | 60 | 5291.0 | 499 | 3974.600 | 13987.632258 | 66468.587097 | 155 | No | 102609 | 8187 | 5.140 | 0.020744 | male | 276.0 | True | True |
| 674 | 3293 | 276084 | 36 | 7669.0 | 476 | 2349.140 | 10670.812903 | 59131.412903 | 155 | classic | 1204953 | 12541 | 0.360 | 0.076677 | female | 416.0 | True | True |
| 675 | 37 | 318480 | 60 | 5308.0 | 422 | 2576.375 | 7293.530769 | 37547.484615 | 130 | No | 70646 | 8547 | 3.145 | 0.021976 | male | 553.0 | True | False |
| 676 | 9140 | 16032 | 48 | 334.0 | 449 | 2418.250 | 3636.251852 | 22821.851852 | 135 | No | 105058 | 9272 | 3.010 | 0.042096 | female | 395.0 | True | True |
| 677 | 5698 | 99216 | 36 | 2756.0 | 443 | 6438.000 | 12802.797980 | 58031.555556 | 99 | No | 1204953 | 12541 | 0.360 | 0.076677 | female | 572.0 | True | True |
| 678 | 6505 | 38496 | 12 | 3208.0 | 390 | 3207.800 | 7720.367347 | 45556.530612 | 49 | classic | 94725 | 9920 | 2.565 | 0.048219 | female | 416.0 | True | True |
| 679 | 9156 | 163332 | 36 | 4537.0 | 443 | 4537.300 | 8251.597222 | 28793.458333 | 72 | No | 70699 | 8968 | 3.090 | 0.025814 | male | 731.0 | True | True |
| 680 | 276 | 160920 | 36 | 4470.0 | 359 | 4841.250 | 11168.215385 | 50293.476923 | 65 | classic | 85852 | 8965 | 7.655 | 0.031886 | male | 325.0 | True | True |
| 681 | 1318 | 185544 | 36 | 5154.0 | 293 | 3670.400 | 6633.824561 | 22717.736842 | 57 | No | 124605 | 8772 | 4.835 | 0.024124 | male | 708.0 | True | True |